Assignment: VAST Mini-Challenge 2

Part 3 out of 3

Yong Kai Lim https://limyongkai.netlify.app/ (Singapore Management University)
07-23-2021

Continue Investigation from Part 2….

3. Can you infer the owners of each credit card and loyalty card? What is your evidence? Where are there uncertainties in your method? Where are there uncertainties in the data? Please limit your answer to 8 images and 500 words.

In order to tag the owners of each credit card and loyalty card to the car id, we would need to combine several factors together to triangulate the results. The two conditions that will be used to triangulate the data between the three datasets are:

  1. CC transaction timestamp has to be between the time period where the car stopped moving which signify the car reaching its location and the subsequent timestamp that the car starts moving, signifying when the car left the location. By using the stationary GPS stop timestamp and the subsequent GPS timestamp, we can narrow down the selection.
  2. Car stationary GPS coordinates has to be within reasonable radius of the location coordinates.

The locations coordinates would be assigned by referencing the tourist map of Abila. However, from the earlier section, we discovered that the tourist map provided might not be accurate in locating the location coordinates as the icons on the tourist map might not represent the exact coordinates of the location.

Furthermore, the tourist map do not have all the locations marked by its logo which will not allows a full join with the locations in the cc transaction data. Table 5 shows the locations from the cc dataset whose logo could not be located visually on the tourist map of Abila. Ranking the number of transaction at each location in descending order, there are high volume of transactions at those locations and the need to map their GPS coordinate is necessary.

## Transactions on 13/01/2014 at "Frydos Autosupply n' More"
locations <- data.frame(location = cc$location) %>% 
  group_by(location) %>% summarize(number_transactions=n())
knitr::kable(locations %>% 
      dplyr::filter(location == "Abila Zacharo" |
                    location == "Brewed Awakenings" |
                    location == "Daily Dealz" |
                    location == "Hippokampos" |
                    location == "Kalami Kafenion" |
                    location == "Kronos Pipe and Irrigation" |
                    location == "Octavio's Office Supplies" |
                    location == "Shoppers' Delight" |
                    location == "Stewart and Sons Fabrication") %>%
      arrange(desc(number_transactions)), "simple",
      caption="Table of location with no traceable coordinates") 
Table 1: Table of location with no traceable coordinates
location number_transactions
Hippokampos 171
Abila Zacharo 72
Kalami Kafenion 64
Brewed Awakenings 30
Shoppers’ Delight 20
Stewart and Sons Fabrication 18
Kronos Pipe and Irrigation 6
Octavio’s Office Supplies 4
Daily Dealz 1

Figure 1 shows the map marked with blue dots representing the stationary GPS coordinate of all the cars except for each employee house. The popular locations can be determined by the frequency of the blue dots at a particular location on the map.

Cross referencing with the transactions table, the locations coordinates were tag with their corresponding coordinates by cross-referencing to the car GPS data and geo-referenced data.

## Getting coordinates of car stop positions
first_gps <- gps_stop %>% 
  group_by(id) %>% 
  filter(row_number()==1) %>% 
  ungroup(id)
gps_pts <- gps_stop %>% ungroup(id) %>%
  add_row(first_gps) %>% group_by(id) %>% arrange(timestamp) %>%
  filter(!(start_vec==1 & stop_vec==1)) %>%
  group_by(id) %>% arrange(timestamp) %>%
  mutate( start.time = ifelse(start_vec== 0 & stop_vec==0, timestamp, NA),
          start.time = ifelse(start_vec==1, timestamp,NA),
          end.time=ifelse(stop_vec==1, timestamp, NA),
          start.gps = ifelse(start_vec==0 & stop_vec==0, geometry,NA),
          start.gps = ifelse(start_vec==1, geometry,NA),
          end.gps=ifelse(stop_vec==1, geometry,NA),
          end.time = ifelse(start_vec==1, lead(end.time), end.time),
          end.gps = ifelse(start_vec==1, lead(end.gps), end.gps)) %>%
  filter(!is.na(start.time))%>%
  mutate(end.gps = ifelse(end.gps=='NULL',start.gps,end.gps),
         end.time = ifelse(is.na(end.time),start.time, end.time),
         start.time= as_datetime(start.time),
         end.time=as_datetime(end.time),
         next.start.time=lead(start.time),
         driving.time=round(difftime(end.time,start.time,units='mins'),2)) %>%
  dplyr::select(id, date, start.time, end.time, start.gps, end.gps, 
                next.start.time, driving.time) %>%
  mutate(start.gps=purrr::map(start.gps, st_point) %>% st_as_sfc(crs=4326))%>%
  mutate(end.gps=purrr::map(end.gps, st_point) %>% st_as_sfc(crs=4326)) 
car$CarID <- as_factor(car$CarID)
gps_pts <- left_join(gps_pts, car, by=c("id"="CarID"))
gps_stop_points1 <- gps_pts %>%
  mutate(time.stop = difftime(next.start.time, end.time, units=c("mins")), 
         time.stop = as.numeric(time.stop))%>% 
  filter(time.stop < 300) %>% 
  dplyr::select(id, start.time, start.gps)

## Generate map with the stop positions in blue dots
tmap_mode("view")
map_POI<-tm_shape(bgmap) +
  tm_rgb(bgmap, r=1, g=2, b=3, alpha=NA, saturation=1, 
         interpolate=TRUE, max.value=255) +
  tm_shape(gps_stop_points1)+
  tm_dots(col="blue", shape=30,id="id",
          popup.vars=c("Car ID"="id", 
                       "Stationary timestamp" = "start.time", 
                       "GPS:"="start.gps"))
tmap_leaflet(map_POI)

Figure 1: GPS stationary locations

The car id are triangulated by tabulating the centroid coordinates of the GPS data from the stationary GPS stop locations from the map. However, there are few limitations by using the methodology mentioned earlier for tagging the owners.

  1. In the earlier section, 4 coffee shops were discovered whose cc transactions timestamp were all at 12:00 but the actual visit time by the employees were in the morning. The inaccuracy of the cc transactions timestamp made it impossible to tag them to the car GPS data.
  2. In the earlier section, the distance between car id 24 GPS stationary coordinates at Frydos Autosupply n’ More was 500 metres away. Locations might not have their dedicated carpark right next to them and some car owners are able to get on foot after parking at a nearby carpark. Hence, the maximum distance of the car stop position to the location coordinates will be set at less than 500 metres, a reasonable distance for traveling on foot.
  3. Employees might not drive their issued car out when they perform the transaction using their cc. Examples could be car pooling for a meal or using their personal vehicles when making the transactions. This will result in a incomplete tagging of the car id GPS to the transaction data.

The interactive heatmap in Figure 2 shows the percentage that were successfully match with the car GPS and cc transaction data by the conditions mentioned earlier. The histogram was also plotted to visualise the distribution of the result. From the two visualisation, we observed that the methodology yield some high percentage match for the car id owner with the cc owner.

# Tagging location coordinates
location_tag <- data.frame(location = c(locations$location,"GAStech"),
 long =c(centroid(rbind(c(24.82590612, 36.05102229),c(24.82591819, 36.05092013),c(24.82598413, 36.05097547)))[1],
         centroid(rbind(c(24.84592966, 36.07443715),c(24.84598782, 36.07434876),c(24.84595026, 36.07437836)))[1],
         centroid(rbind(c(24.85097804, 36.06349268),c(24.85099445, 36.06342076),c(24.85103178, 36.06348173)))[1],
         centroid(rbind(c(24.87617634, 36.07713037),c(24.87621582, 36.07713598),c(24.87619872, 36.07715385)))[1],
         centroid(rbind(c(24.85626503, 36.07529323),c(24.85631411, 36.07523202),c(24.85634841, 36.07528136)))[1],
         centroid(rbind(c(24.85089145, 36.08172086),c(24.85096025, 36.08176242),c(24.85087799, 36.08180554)))[1],
         centroid(rbind(c(24.90119998, 36.05402165),c(24.90128202, 36.05408823),c(24.90116585, 36.05411015)))[1],
         NA,
         centroid(rbind(c(24.88089399, 36.05851786),c(24.88092086, 36.05858619),c(24.8808655, 36.05856303)))[1],
         centroid(rbind(c(24.8951996, 36.07073983),c(24.89517891, 36.07062423),c(24.89526281, 36.07069274)))[1],                         
         centroid(rbind(c(24.88983886, 36.05469486),c(24.88978433, 36.05463184),c(24.88977321, 36.05467589)))[1],
         centroid(rbind(c(24.86416839, 36.07332041),c(24.86417651, 36.07336116),c(24.86419582, 36.07332868)))[1],
         NA,
         centroid(rbind(c(24.86068835, 36.08962196),c(24.86068191, 36.08954231),c(24.8607611, 36.08960361)))[1],
         centroid(rbind(c(24.84132949, 36.07213193),c(24.84134818, 36.07212045),c(24.4134819, 36.07212044)))[1],
         centroid(rbind(c(24.905573, 36.06044638),c(24.90561679, 36.06033304),c(24.90568587, 36.06040053)))[1],
         centroid(rbind(c(24.85804364, 36.05970763),c(24.8580772, 36.05975308),c(24.8579808, 36.05976284)))[1],
         centroid(rbind(c(24.85804364, 36.05970763),c(24.8580772, 36.05975308),c(24.8579808, 36.05976284)))[1],
         centroid(rbind(c(24.90096913, 36.05842562),c(24.90107066, 36.05844726),c(24.90097455, 36.05850897)))[1],
         centroid(rbind(c(24.88586605, 36.063639),c(24.88595361, 36.06364584),c(24.88586737, 36.06371539)))[1],
         centroid(rbind(c(24.85756422, 36.07660977),c(24.85763811, 36.07664766),c(24.857573, 36.07669909)))[1],
         centroid(rbind(c(24.87330651, 36.06751231),c(24.87335583, 36.06750587),c(24.87333867, 36.06755141)))[1],                    
         centroid(rbind(c(24.85237319, 36.06582037),c(24.85241027, 36.06582475),c(24.85237372, 36.06584816)))[1],
         centroid(rbind(c(24.89986767, 36.05442391),c(24.89996154, 36.05448329),c(24.89987365, 36.05453273)))[1],
         centroid(rbind(c(24.84983351, 36.06587998),c(24.84983936, 36.06582196),c(24.8497762, 36.06583535)))[1],
         NA,
         centroid(rbind(c(24.88551872, 36.05840982),c(24.88542068, 36.0584603),  c(24.88553455, 36.05844325)))[1],
         centroid(rbind(c(24.83307421, 36.0653098),c(24.83314028, 36.06523446),  c(24.84143955, 36.06403449),c(24.84141463, 36.06410072)))[1],
         NA,
         centroid(rbind(c(24.87077341, 36.05196196),c(24.87081903, 36.05192066),c(24.87083665, 36.05197804)))[1],
         centroid(rbind(c(24.85227441, 36.06324941),c(24.85226894, 36.06330479),c(24.8523291, 36.0632684)))[1],
         NA,NA,
         centroid(rbind(c(24.87148791, 36.06774029),c(24.8714995, 36.06774623),c(24.87149104, 36.06776587)))[1],
         centroid(rbind(c(24.87956897, 36.04802112),c(24.8795714, 36.04804908),  c(24.8795745, 36.0480309)))[1]),
 lat = c(centroid(rbind(c(24.82590612, 36.05102229),c(24.82591819, 36.05092013),c(24.82598413, 36.05097547)))[2],
         centroid(rbind(c(24.84592966, 36.07443715),c(24.84598782, 36.07434876),c(24.84595026, 36.07437836)))[2],
         centroid(rbind(c(24.85097804, 36.06349268),c(24.85099445, 36.06342076),c(24.85103178, 36.06348173)))[2],
         centroid(rbind(c(24.87617634, 36.07713037),c(24.87621582, 36.07713598),c(24.87619872, 36.07715385)))[2],
         centroid(rbind(c(24.85626503, 36.07529323),c(24.85631411, 36.07523202),c(24.85634841, 36.07528136)))[2],
         centroid(rbind(c(24.85089145, 36.08172086),c(24.85096025, 36.08176242),c(24.85087799, 36.08180554)))[2],
         centroid(rbind(c(24.90119998, 36.05402165),c(24.90128202, 36.05408823),c(24.90116585, 36.05411015)))[2],
         NA,
         centroid(rbind(c(24.88089399, 36.05851786),c(24.88092086, 36.05858619),c(24.8808655, 36.05856303)))[2],
         centroid(rbind(c(24.8951996, 36.07073983),c(24.89517891, 36.07062423),c(24.89526281, 36.07069274)))[2],
         centroid(rbind(c(24.88983886, 36.05469486),c(24.88978433, 36.05463184),c(24.88977321, 36.05467589)))[2],
         centroid(rbind(c(24.86416839, 36.07332041),c(24.86417651, 36.07336116),c(24.86419582, 36.07332868)))[2],
         NA,
         centroid(rbind(c(24.86068835, 36.08962196),c(24.86068191, 36.08954231),c(24.8607611, 36.08960361)))[2],
         centroid(rbind(c(24.84132949, 36.07213193),c(24.84134818, 36.07212045),c(24.4134819, 36.07212044)))[2],
         centroid(rbind(c(24.905573, 36.06044638),c(24.90561679, 36.06033304),c(24.90568587, 36.06040053)))[2],
         centroid(rbind(c(24.85804364, 36.05970763),c(24.8580772, 36.05975308),c(24.8579808, 36.05976284)))[2],
         centroid(rbind(c(24.85804364, 36.05970763),c(24.8580772, 36.05975308),c(24.8579808, 36.05976284)))[2],
         centroid(rbind(c(24.90096913, 36.05842562),c(24.90107066, 36.05844726),c(24.90097455, 36.05850897)))[2],
         centroid(rbind(c(24.88586605, 36.063639),c(24.88595361, 36.06364584),c(24.88586737, 36.06371539)))[2],
         centroid(rbind(c(24.85756422, 36.07660977),c(24.85763811, 36.07664766),c(24.857573, 36.07669909)))[2],
         centroid(rbind(c(24.87330651, 36.06751231),c(24.87335583, 36.06750587),c(24.87333867, 36.06755141)))[2],
         centroid(rbind(c(24.85237319, 36.06582037),  c(24.85241027, 36.06582475),c(24.85237372, 36.06584816)))[2],
         centroid(rbind(c(24.89986767, 36.05442391),c(24.89996154, 36.05448329),  c(24.89987365, 36.05453273)))[2],
         centroid(rbind(c(24.84983351, 36.06587998),c(24.84983936, 36.06582196),c(24.8497762, 36.06583535)))[2],
          NA,
         centroid(rbind(c(24.83307421, 36.0653098),c(24.83314028, 36.06523446),  c(24.84143955, 36.06403449),c(24.84141463, 36.06410072)))[1],
         centroid(rbind(c(24.88551872, 36.05840982),c(24.88542068, 36.0584603),  c(24.88553455, 36.05844325)))[2],
          NA,
         centroid(rbind(c(24.87077341, 36.05196196),c(24.87081903, 36.05192066),c(24.87083665, 36.05197804)))[2],
         centroid(rbind(c(24.85227441, 36.06324941),c(24.85226894, 36.06330479),c(24.8523291, 36.0632684)))[2],
         NA,NA,
         centroid(rbind(c(24.87148791, 36.06774029),c(24.8714995, 36.06774623),c(24.87149104, 36.06776587)))[2],
         centroid(rbind(c(24.87956897, 36.04802112),c(24.8795714, 36.04804908),  c(24.8795745, 36.0480309)))[2]))
location_tag <- location_tag %>% na.omit()
location_tag <- st_as_sf(location_tag, coords=c("long","lat"), crs=4326)

## join GPS data with transaction data with location coordinates
final_trans_gps <- inner_join(final_trans_1, location_tag, by=c("location")) %>%
  rename(loc.coord=geometry)
## Join with car GPS and tag the location to car gps
gps_match <- final_trans_gps %>% 
  left_join(gps_pts , by=c("date"))%>% 
  group_by(last4ccnum) %>% arrange(datetime) %>%
  filter(datetime > end.time & datetime <= next.start.time + minutes(30)) %>%
  mutate(diff.dist = st_distance(loc.coord, end.gps, by_element=TRUE),
         diff.dist = as.numeric(diff.dist)) %>%
  filter(diff.dist <500)
tagging <-gps_match %>%group_by(last4ccnum, id)%>%
  summarize(tag=n()) %>% arrange(desc(tag))
## Get total count of transactions minus the 4 locations per cc num
trans_collapse <- cc %>% mutate(last4ccnum=as_factor(last4ccnum)) %>% 
  filter(!(location %in% c("Bean There Done That",
                           "Brewed Awakenings",
                           "Coffee Shack",
                           "Jack's Magical Beans"))) %>%
  group_by(last4ccnum) %>% summarize(total=n())
## Limit to top 3 match only by percentage
tagging_cc_gps <- left_join(tagging, trans_collapse, by=c("last4ccnum")) %>%
  mutate(percent=round(tag/total*100,2))

tag_plot<-ggplot(tagging_cc_gps, aes(x=id, y=last4ccnum,fill=percent))+
  geom_tile() + scale_fill_gradient(low="sienna1", high="navyblue") +
  xlab("Car ID") +ylab("CC last 4 number")+ 
  labs(fill="% match")
histogram<-ggplot(tagging_cc_gps,aes(percent))+geom_histogram(binwidth=5)+
  stat_function(fun=dnorm,aes(color="red"),
                args=list(mean=mean(tagging_cc_gps$percent),
                sd=sd(tagging_cc_gps$percent)))
ggplotly(tag_plot) %>% layout(hoverlabel=list(bgcolor="white"))

Figure 2: Car GPS tagging to CC number

ggplotly(histogram) %>% layout(hoverlabel=list(bgcolor="white"))

Figure 2: Car GPS tagging to CC number

Hence, we can confidently infer that matches over 75% will be accurate. However, as there are more cc owners (55 unique owners) than car owners (35 unique car id) and the truck drivers share vehicles (5 unique truck id), we will drop the truck drivers with car id of 100 and above. Observation of the heatmap in figure 2 reveals that car id 23, car id 29 and car id 30 has matches of more than one cc number and car id 28 does not have a match with more than 75%.

From Table 6, we observe that car id 23 matches to three unique cc number with matches over 75%. The highest percentage match to cc 3484 at 91.43% shows high probability for inference, hence the observation that matches to cc 8202 and 8411 will be dropped.

For car id 29 and 30, the matches to cc number percentage are relatively high and defers less than 10%. Further investigation on the GPS map location will be performed to verify which match to retain.

## Get the match of car id to cc last4ccnum
tagging <- tagging_cc_gps %>% mutate(id=as.character(id), id=as.numeric(id)) %>% 
  filter(percent>=75 & id<100)
knitr::kable(tagging %>% filter(id==23 | id==29 | id==30) %>% 
               arrange(id), "simple",
      caption="Table of employees record and their cc and loyalty number")
Table 2: Table of employees record and their cc and loyalty number
last4ccnum id tag total percent
3484 23 31 35 88.57
8202 23 25 33 75.76
8411 23 25 32 78.12
3547 29 18 20 90.00
5921 29 13 14 92.86
6901 30 31 37 83.78
8202 30 25 33 75.76
final_tagging <- tagging %>% 
  filter(!(last4ccnum==8202 & id==23), !(last4ccnum==8411 & id ==23))

Investigation of car id 28 low cc transactions matches was visualised in Figure 3 and it revealed that the GPS coordinates of car id 28 has lots of noise. The noise in the GPS line caused a wider spread of GPS line in the visualisation on the map and also zig-zag incoherent GPS path. This most probably signifies a faulty GPS signal on the car.

Secondly, we observe that the stop position was not accurate. For example, the frequency of GPS stop coordinates at the extreme south of the map should be at GAStech. Hence, the GPS stop coordinates seems to deviate in the North-West direction. The most probable explanation will be a faulty GPS system since the GPS points were noisy and were not correctly geo-referenced on the map.

## Map geometry for original car id 28 data
gps_path5 <- gps_sf %>%
  filter(id==28) %>% 
  group_by(id) %>% 
  summarize(m = mean(timestamp), do_union=FALSE) %>% st_cast("LINESTRING")
gps_28_points <- gps_stop %>% filter(id ==28)

## Plot interactive map
tmap_mode("view")
map5<-tm_shape(bgmap) +
  tm_rgb(bgmap, r=1, g=2, b=3, alpha=NA, saturation=1, 
         interpolate=TRUE, max.value=255) +
  tm_shape(gps_path5)+
  tm_lines() +
  tm_shape(gps_28_points)+
  tm_dots(col="blue")
tmap_leaflet(map5)

Figure 3: Original GPS for car id 28

After re-calibrating the GPS coordinates for car id 28, Figure 4 shows the GPS movement data for car id 28. With the re-calibrated GPS data, we would match it with the cc transaction data to infer which cc belongs to car id 28.

From the map in Figure 4, the unqiue observation was that car id 28 visited Ahaggo Museum on the 18th and 19th of Jan and frequently patronise Jack’s Magical Beans and Ouzeri Elian over the two weeks.

From the cc transaction table, a search of Ahaggo Museum revealed that cc 1286, 7384 and 9241 made transactions on the 18th and 19th of Jan. Next, a search of Jack’s Magical Beans shows that only cc 9241 out of the three cc made transactions at the location. Lastly, a search of Ouzeri Elian on the datatable reveals that cc 9241 made 6 transactions at the location. Hence, we are confident to infer that car id 28 is the owner of cc 9241.

## Map geometry for re-calibrated Car id 28
gps28 <- gps %>% filter(id==28) %>% 
  mutate(long = long +0.005,
         lat=lat-0.002)
gps_sf28 <- st_as_sf(gps28, coords=c("long","lat"), crs=4326)
gps_path28 <- gps_sf28 %>% group_by(id) %>% 
  summarize(m = mean(timestamp), do_union=FALSE) %>% st_cast("LINESTRING")
gps28_pt <- gps_sf28 %>% 
  group_by(id) %>% arrange(timestamp) %>%
  mutate(start_diff= as.numeric(timestamp - lag(timestamp,default=first(timestamp)))/60,
         stop_diff= as.numeric(lead(timestamp)-timestamp)/60,
         date = as.Date(timestamp)) %>%
  rename(gps.coord=geometry) %>% 
  filter(start_diff>5 | stop_diff >5) %>% 
  mutate(start_vec=ifelse(start_diff>5,1,0), stop_vec=ifelse(stop_diff>5,1,0))

## Plot interactive map
tmap_mode("view")
map6<-tm_shape(bgmap) +
  tm_rgb(bgmap, r=1, g=2, b=3, alpha=NA, saturation=1, 
         interpolate=TRUE, max.value=255) +
  tm_shape(gps_path28)+
  tm_lines() +
  tm_shape(gps28_pt)+
  tm_dots(col="blue")
tmap_leaflet(map6)

Figure 4: Re-calibrated GPS for car id 28

final_tagging <- final_tagging %>% 
  dplyr::select(last4ccnum, id) %>% 
  mutate(last4ccnum = as.character(last4ccnum),
         id = as.character(id)) %>% 
  bind_rows(c(last4ccnum="9241", id="28"))

Next, we will focus on car id 29 where it matches 90% of cc 3547 transactions and 100% of cc 5921. The high proportion of matches on both credit card warrants some investigation into the data.

Looking at table 7 for both cc number, we observe that cc 3547 has transactions between 12/01/2014 to 19/01/2014 and cc 5921 has transactions between 06/01/2014 to 10/01/2014. Cross-referencing the GPS data for car id 29 in Figure 5, we can observe that the cc transactions matches the GPS data of car id 29. A possible deduction is that the owner of car id 29 used both cc card as there was no overlap in the transaction dates for both cc. Possible scenario could be that the owner switch the CC from 5921 to 3547 after 10/01/2014. However, there might be missing data on 11/01/2014 where it was not captured on both cc. Hence, we will tag car id 29 to both cc 5921 and 3547.

gps_path29 <- gps_sf %>% 
  filter(id==29) %>% 
  group_by(id) %>% 
  summarize(m = mean(timestamp), do_union=FALSE) %>% st_cast("LINESTRING")
gps29_pt <- gps_stop_points1 %>% filter(id==29)
tmap_mode("view")
map7<-tm_shape(bgmap) +
  tm_rgb(bgmap, r=1, g=2, b=3, alpha=NA, saturation=1, 
         interpolate=TRUE, max.value=255) +
  tm_shape(gps_path29)+
  tm_lines() +
  tm_shape(gps29_pt)+
  tm_dots(col="blue")
cc3547 <- cc %>% filter(last4ccnum==3547) %>% dplyr::select(-datetime, -date)
cc5921 <- cc %>% filter(last4ccnum==5921) %>% dplyr::select(-datetime, -date)
knitr::kable(list(cc3547,cc5921),caption="Transactions for CC 3547 & 5921")
Table 3: Transactions for CC 3547 & 5921
timestamp location price last4ccnum
01/12/2014 16:08 Shoppers’ Delight 51.50 3547
01/12/2014 20:11 Katerina’s Café 67.14 3547
01/13/2014 07:40 Coffee Cameleon 19.93 3547
01/13/2014 13:52 Katerina’s Café 29.55 3547
01/13/2014 19:50 Katerina’s Café 89.83 3547
01/14/2014 07:37 Coffee Cameleon 9.80 3547
01/14/2014 13:41 Katerina’s Café 75.46 3547
01/14/2014 20:17 Katerina’s Café 36.95 3547
01/15/2014 07:51 Coffee Cameleon 14.47 3547
01/15/2014 13:58 Abila Zacharo 33.80 3547
01/15/2014 21:21 Katerina’s Café 27.48 3547
01/16/2014 07:38 Coffee Cameleon 67.19 3547
01/16/2014 13:27 Abila Zacharo 31.31 3547
01/16/2014 19:47 Katerina’s Café 34.34 3547
01/17/2014 07:38 Coffee Cameleon 10.27 3547
01/17/2014 13:42 Katerina’s Café 21.01 3547
01/18/2014 13:34 Ouzeri Elian 25.75 3547
01/18/2014 15:31 General Grocer 477.60 3547
01/18/2014 19:53 Katerina’s Café 76.10 3547
01/19/2014 18:54 Katerina’s Café 72.25 3547
timestamp location price last4ccnum
01/06/2014 07:49 Coffee Cameleon 8.39 5921
01/06/2014 13:48 Ouzeri Elian 30.87 5921
01/06/2014 20:33 Katerina’s Café 15.52 5921
01/07/2014 07:46 Coffee Cameleon 9.10 5921
01/07/2014 13:54 Gelatogalore 88.97 5921
01/07/2014 20:32 Katerina’s Café 19.53 5921
01/08/2014 07:52 Coffee Cameleon 12.26 5921
01/08/2014 13:29 Kalami Kafenion 24.08 5921
01/08/2014 20:42 Katerina’s Café 92.83 5921
01/09/2014 07:37 Coffee Cameleon 19.99 5921
01/09/2014 14:13 Guy’s Gyros 17.44 5921
01/09/2014 19:30 Katerina’s Café 26.60 5921
01/10/2014 07:47 Coffee Cameleon 11.54 5921
01/10/2014 19:56 Katerina’s Café 21.89 5921
tmap_leaflet(map7)

Figure 5: GPS for car id 29

Lastly, we will look at car id 30 with cc 6901 and 8202. The GPS data for car id 30 was visualise in Figure 6 and the transaction from cc 6901 and 8202 in table 8.

Comparing the GPS data map and cc translation data, we focused on locations with a lower frequency of visit and locations in a less congested area for easier verification. From the 3 locations and transaction details below, we can deduce that cc 6901 matches car id 30.

  1. GPS data showed a visit to Ouzeri Elian on 07/01/2014 and only cc 6901 has matching transaction.
  2. GPS data showed visits to Kalami Kafenion on 15/01/2014 and 18/01/2014 and only cc 6901 has matching transaction for both days.
  3. GPS data showed visits to Hippokampos on 10/01/2014 and 14/01/2014 and only cc 6901 has matching transaction for both days.
gps_path_30 <- gps_sf %>% 
  filter(id==30) %>% 
  group_by(id) %>%
  summarize(m = mean(timestamp), do_union=FALSE) %>% st_cast("LINESTRING")
gps_stop_points30 <- gps_pts %>%
  mutate(time.stop = difftime(next.start.time, end.time, units=c("mins")), 
         time.stop = as.numeric(time.stop))%>% 
  filter(time.stop < 300 & id==30) %>% 
  dplyr::select(id, start.time, start.gps)

## Plot interactive map
tmap_mode("view")
map8<-tm_shape(bgmap) +
  tm_rgb(bgmap, r=1, g=2, b=3, alpha=NA, saturation=1, 
         interpolate=TRUE, max.value=255) +
  tm_shape(gps_path_30) +
  tm_lines(col ="red") +
  tm_shape(gps_stop_points30)+
  tm_dots(col="blue", shape=30)
tmap_leaflet(map8)

Figure 6: GPS for car id 30

cc6901 <- cc %>% filter(last4ccnum==6901) %>% dplyr::select(-datetime, -date)
cc8202 <- cc %>% filter(last4ccnum==8202) %>% dplyr::select(-datetime, -date)
knitr::kable(list(cc6901,cc8202),caption="Transactions for CC 6901 & 8202")
Table 4: Transactions for CC 6901 & 8202
timestamp location price last4ccnum
01/06/2014 08:07 Brew’ve Been Served 5.66 6901
01/06/2014 14:17 Katerina’s Café 19.65 6901
01/06/2014 20:09 Guy’s Gyros 11.94 6901
01/07/2014 08:18 Brew’ve Been Served 47.74 6901
01/07/2014 14:09 Ouzeri Elian 59.51 6901
01/07/2014 20:20 Frydos Autosupply n’ More 312.73 6901
01/08/2014 08:03 Brew’ve Been Served 10.15 6901
01/08/2014 13:51 Abila Zacharo 30.85 6901
01/08/2014 20:57 Guy’s Gyros 10.28 6901
01/09/2014 07:58 Brew’ve Been Served 19.47 6901
01/09/2014 13:56 Guy’s Gyros 8.87 6901
01/09/2014 20:20 Frydos Autosupply n’ More 31.24 6901
01/10/2014 08:01 Brew’ve Been Served 5.31 6901
01/10/2014 13:58 Hippokampos 39.89 6901
01/10/2014 20:09 Guy’s Gyros 29.81 6901
01/11/2014 14:19 Abila Zacharo 45.20 6901
01/11/2014 20:26 Frydos Autosupply n’ More 261.00 6901
01/12/2014 13:31 Guy’s Gyros 34.74 6901
01/12/2014 16:27 Ahaggo Museum 120.20 6901
01/13/2014 08:21 Brew’ve Been Served 13.19 6901
01/13/2014 14:13 Guy’s Gyros 12.76 6901
01/13/2014 20:45 Shoppers’ Delight 144.40 6901
01/14/2014 08:13 Brew’ve Been Served 12.31 6901
01/14/2014 13:57 Hippokampos 17.18 6901
01/14/2014 20:43 Frydos Autosupply n’ More 146.74 6901
01/15/2014 08:14 Brew’ve Been Served 18.58 6901
01/15/2014 14:13 Kalami Kafenion 28.82 6901
01/16/2014 08:03 Brew’ve Been Served 16.67 6901
01/16/2014 13:55 Abila Zacharo 8.43 6901
01/16/2014 20:09 Guy’s Gyros 28.27 6901
01/17/2014 08:17 Brew’ve Been Served 5.29 6901
01/17/2014 13:55 Guy’s Gyros 32.31 6901
01/17/2014 19:46 Guy’s Gyros 16.83 6901
01/18/2014 14:17 Kalami Kafenion 53.36 6901
01/18/2014 20:07 General Grocer 108.49 6901
01/19/2014 14:20 Abila Zacharo 47.80 6901
01/19/2014 20:51 Guy’s Gyros 39.60 6901
timestamp location price last4ccnum
01/06/2014 08:17 Brew’ve Been Served 15.39 8202
01/06/2014 13:58 Hippokampos 38.25 8202
01/06/2014 20:12 Frydos Autosupply n’ More 80.85 8202
01/07/2014 07:58 Brew’ve Been Served 17.40 8202
01/07/2014 13:57 Katerina’s Café 37.44 8202
01/07/2014 20:13 Katerina’s Café 65.02 8202
01/08/2014 08:01 Brew’ve Been Served 3.92 8202
01/08/2014 13:42 Kalami Kafenion 22.49 8202
01/08/2014 20:35 Katerina’s Café 16.93 8202
01/09/2014 07:56 Brew’ve Been Served 98.25 8202
01/09/2014 14:09 Guy’s Gyros 27.69 8202
01/09/2014 20:22 Katerina’s Café 29.82 8202
01/10/2014 08:17 Brew’ve Been Served 8.47 8202
01/10/2014 14:11 Gelatogalore 32.19 8202
01/10/2014 20:02 Frydos Autosupply n’ More 43.65 8202
01/11/2014 20:06 Katerina’s Café 52.45 8202
01/12/2014 20:43 Frydos Autosupply n’ More 161.96 8202
01/13/2014 08:23 Brew’ve Been Served 19.89 8202
01/13/2014 14:00 Gelatogalore 36.24 8202
01/14/2014 07:53 Brew’ve Been Served 9.53 8202
01/14/2014 14:16 Hippokampos 16.73 8202
01/14/2014 20:42 Katerina’s Café 46.61 8202
01/15/2014 08:06 Brew’ve Been Served 3.47 8202
01/15/2014 13:46 Guy’s Gyros 16.58 8202
01/15/2014 20:26 Katerina’s Café 61.61 8202
01/16/2014 08:08 Brew’ve Been Served 90.33 8202
01/16/2014 13:45 Kalami Kafenion 9.27 8202
01/16/2014 20:39 Katerina’s Café 30.56 8202
01/17/2014 08:09 Brew’ve Been Served 9.30 8202
01/17/2014 14:00 Guy’s Gyros 15.12 8202
01/17/2014 20:19 Katerina’s Café 36.12 8202
01/18/2014 14:02 Kalami Kafenion 42.73 8202
01/18/2014 19:46 Katerina’s Café 11.19 8202
final_tagging <- final_tagging %>% ungroup() %>% 
  filter(!(last4ccnum=="8202"& id=="30")) %>% 
  mutate(id=as_factor(id)) %>% 
  left_join(car, by=c("id"="CarID")) %>% 
  mutate(name=paste(LastName,FirstName))

The tagging of all 35 car owners (excluding truck drivers) have been completed and verified.

4. Given the data sources provided, identify potential informal or unofficial relationships among GASTech personnel. Provide evidence for these relationships.

To visusalise potential relationships relationships, network analysis was used to look at the relationships. Figure 7 shows an interactive network analysis of each car ID employee and the locations that they made transactions at with their GAStech cc. From the network analysis throughout the two weeks of data, we can uncover some relationships among employees.

cc_data <- cc %>% mutate(day=lubridate::day(datetime), hour=lubridate::hour(datetime))
sources <- cc_data %>% mutate(hour=lubridate::hour(datetime)) %>% 
  distinct(last4ccnum) %>% left_join(final_tagging, by=c("last4ccnum")) %>% 
  mutate(name=paste(LastName,FirstName)) %>% 
  rename(label = name) %>% drop_na(id) %>%
  mutate(CurrentEmploymentType=ifelse(is.na(CurrentEmploymentType),"Driver",CurrentEmploymentType))
destinations <- cc_data  %>% 
  distinct(location) %>%
  rename(label = location)
cc_nodes <- full_join(sources, 
                      destinations, 
                      by = "label") %>% rename(car_id=id)
cc_nodes <- cc_nodes %>% 
  rowid_to_column("id") %>%
  mutate(CurrentEmploymentType=ifelse(is.na(CurrentEmploymentType),
                                      "Locations",CurrentEmploymentType),
         title=label) %>% 
  rename(group=CurrentEmploymentType)
edges <- cc_data %>% 
  mutate(last4ccnum = as.character(last4ccnum)) %>%  
  filter(last4ccnum %in% final_tagging$last4ccnum) %>% 
  group_by(last4ccnum, location, day, hour) %>%
  summarise(weight = n()) %>% 
  ungroup()
cc_edges <- edges %>% 
  inner_join(cc_nodes,by = c("last4ccnum")) %>% 
  rename(from = id)
cc_edges <- cc_edges %>% 
  inner_join(cc_nodes,by = c("location" = "label")) %>% 
  rename(to = id) %>% 
  dplyr::select(from, to,day, hour, weight) %>% 
  mutate(time_bin = case_when(hour>=0&hour<6~"Midnight",
                              hour>=6&hour<12~"Morning",
                              hour>=12&hour<18~"Afternoon",
                              hour>=18~"Night"),
         weekday.weekend = ifelse(day %in% c(11,12,18,19),"Weekend","Weekday"),
         day.week = case_when(day==6|day==13~"Monday",
                              day==7|day==14~"Tuesday",
                              day==8|day==15~"Wednesday",
                              day==9|day==16~"Thursday",
                              day==10|day==17~"Friday",
                              day==11|day==18~"Saturday",
                              day==12|day==19~"Sunday",))

visNetwork(cc_nodes, cc_edges, main="Network analysis by location and employee") %>% 
  visIgraphLayout(layout = "layout_on_grid") %>% 
  visOptions(highlightNearest = TRUE, nodesIdSelection = TRUE) %>% 
  visLegend()

Figure 7: Network analysis by location and employee

  1. Desafio Golf Course was visited by GAStech Executives only. Based on figure ??, we observed that there are only cc transactions made on Sunday at the location. Hence, we can infer that all five executive of GAStech might have some after working hours relationship by gathering at the Desafio Golf Course on both Sundays. They might be playing golf or a regular gathering at the location.
  2. Chostus Hotel was visited by Orilla Elsa, Tempestad Brand and Sanjorge Jr. Sten throughout the 2 weeks of data. Table 9 below shows the transaction at Chostus Hotel only and we can observe that Orilla Elsa and Tempestad Brand made transactions on 4 separate dates during lunch. The transactions were relatively expensive for a lunch meal in comparison to other food and beverage location in Abila. Alternatively, they might have paid for a hotel room during their visit to the location. Furthermore, both of them are from the same department with the same title in GAStech and there might be some relationship between them.
knitr::kable(cc %>% mutate(last4ccnum=as.character(last4ccnum)) %>% 
               left_join(final_tagging, by=c("last4ccnum")) %>% 
               filter(location=="Chostus Hotel") %>% 
               select(name, CurrentEmploymentType, CurrentEmploymentTitle,
                      location, timestamp, price),
             caption="Table of transaction at Chostus Hotel")
Table 5: Table of transaction at Chostus Hotel
name CurrentEmploymentType CurrentEmploymentTitle location timestamp price
Orilla Elsa Engineering Drill Technician Chostus Hotel 01/08/2014 12:56 107.51
Tempestad Brand Engineering Drill Technician Chostus Hotel 01/08/2014 13:19 111.89
Tempestad Brand Engineering Drill Technician Chostus Hotel 01/10/2014 13:08 133.25
Orilla Elsa Engineering Drill Technician Chostus Hotel 01/10/2014 13:11 197.41
Orilla Elsa Engineering Drill Technician Chostus Hotel 01/14/2014 13:17 109.54
Tempestad Brand Engineering Drill Technician Chostus Hotel 01/14/2014 13:21 113.08
Tempestad Brand Engineering Drill Technician Chostus Hotel 01/17/2014 13:49 114.22
Orilla Elsa Engineering Drill Technician Chostus Hotel 01/17/2014 13:54 159.62
NA NA NA Chostus Hotel 01/18/2014 12:03 600.00
  1. Bean There Done That location had only transactions made by the engineering department (yellow nodes in figure 7). Bean There Done That is the furthest location from GAStech but a certain group of customer still visits and purchase from them. Visualising the GPS stationary data for the 7 customers from the engineering team in figure 8, we observe that 5 out of 7 of the customers resides in the area of Carnero Street and Parla Park whereas the remaining 2 customers, Frente Birgitta and Dedos Lidelse resides between Arkadiou Park and Sannan Park. The 2 customers residential location are at the same coordinates yet far away from Bean There Done That. However, they still patronise and purchase from there might signify some relationship between both of them.
bean_cust <- final_tagging %>% filter(name == "Frente Birgitta"|
                                      name == "Calzas Axel"|
                                      name == "Frente Vira"|
                                      name == "Azada Lars"|
                                      name == "Balas Felix"|
                                      name == "Dedos Lidelse"|
                                      name == "Cazar Gustav")
gps_stop_points_bean <- gps_pts %>%
  filter(id %in% bean_cust$id) %>% 
  mutate(time.stop = difftime(next.start.time, end.time,units=c("mins")), 
         time.stop = as.numeric(time.stop),
         name=paste(LastName,FirstName))%>% 
  filter(time.stop < 300 ) %>% 
  dplyr::select(id, start.time, start.gps,name) %>% 
  mutate(id=as.character(id))

## Plot interactive map
tmap_mode("view")
map_bean<-tm_shape(bgmap) +
  tm_rgb(bgmap, r=1, g=2, b=3, alpha=NA, saturation=1, 
         interpolate=TRUE, max.value=255) +
  tm_shape(gps_stop_points_bean)+
  tm_dots(col="name",palette="Dark2",id="start.time")

tmap_leaflet(map_bean)

Figure 8: Stationary GPS points of Bean There Done That customers

To investigate non-official relationships, we will focus on after working hours transactions. The network analysis was drilled down to transactions performed on Weekday Nights only and dining locations that had transactions in the afternoon or night to reduce cluttering of the network analysis. Figure 9 shows the network analysis for weekday nights transactions only. The edge line connecting the employees to location are colored by day to visualize if any group of employees visited a particular location on the same day in the night.

sources <- cc_data %>% mutate(hour=lubridate::hour(datetime)) %>% 
  distinct(last4ccnum) %>% left_join(final_tagging, by=c("last4ccnum")) %>% 
  mutate(name=paste(LastName,FirstName)) %>% 
  rename(label = name) %>% drop_na(id) %>% 
  mutate(CurrentEmploymentType=ifelse(is.na(CurrentEmploymentType),
                                      "Driver",CurrentEmploymentType))
destinations <- cc_data  %>% 
  filter(location =="Ouzeri Elian"|
         location=="Guy's Gyros"|
         location=="Katerina's Cafe"|
         location=="Hippokampos"|
         location=="Abila Zacharo"|
         location=="Gelatogalore"|
         location=="Kalami Kafenion"|
         location=="Chostus Hotel") %>% 
  distinct(location) %>%
  rename(label = location)
cc_nodes <- full_join(sources, 
                      destinations, 
                      by = "label") %>% rename(car_id=id)
cc_nodes <- cc_nodes %>% 
  rowid_to_column("id") %>%
  mutate(CurrentEmploymentType=ifelse(is.na(CurrentEmploymentType),
                                      "Locations",CurrentEmploymentType),
         title=label) %>% 
  rename(group=CurrentEmploymentType)
edges <- cc_data %>% 
  mutate(last4ccnum = as.character(last4ccnum)) %>%  
  filter(last4ccnum %in% final_tagging$last4ccnum) %>% 
  group_by(last4ccnum, location, day, hour) %>%
  summarise(weight = n()) %>% 
  ungroup()
cc_edges <- edges %>% 
  inner_join(cc_nodes,by = c("last4ccnum")) %>% 
  rename(from = id)
cc_edges <- cc_edges %>% 
  inner_join(cc_nodes,by = c("location" = "label")) %>% 
  rename(to = id) %>% 
  dplyr::select(from, to,day, hour, weight) %>% 
  mutate(time_bin = case_when(hour>=0&hour<6~"Midnight",
                              hour>=6&hour<12~"Morning",
                              hour>=12&hour<18~"Afternoon",
                              hour>=18~"Night"),
         weekday.weekend = ifelse(day %in% c(11,12,18,19),"Weekend","Weekday"),
         day.week = case_when(day==6|day==13~"Monday",
                              day==7|day==14~"Tuesday",
                              day==8|day==15~"Wednesday",
                              day==9|day==16~"Thursday",
                              day==10|day==17~"Friday",
                              day==11|day==18~"Saturday",
                              day==12|day==19~"Sunday",))
cc_edges_dn<- cc_edges %>% 
  filter(time_bin=="Night", weekday.weekend=="Weekday") %>% 
  mutate(color=rainbow(max(day))[day])
# cc_edges_dn$color <- palette(rainbow(7))[cc_edges_dn$day]
visNetwork(cc_nodes, cc_edges_dn, 
           main="Network analysis by location and employee") %>% 
  visIgraphLayout(layout = "layout_on_grid") %>% 
  visOptions(highlightNearest = TRUE, nodesIdSelection = TRUE) %>%
  visEdges(smooth=FALSE, color="color") %>% 
  visLegend()

Figure 9: Network analysis on Weekday Night

  1. Employee Baza Isak and Calixto Nils patronised Ouzeri Elian on several night at the same time. From table 10, we observe that on 08/01 and 16/1, Baza Isak and Calixto Nils transaction timing were only 1 minute apart and on 17/1, both had transactions in the evening. A probable deduction could be that they are of good friends since they are in the same department who hang out and have dinner together after working hours.
final_cc <- cc %>% mutate(left4ccnum=as.character(last4ccnum)) %>% 
  left_join(final_tagging, by="last4ccnum") %>% 
  mutate(day=lubridate::day(datetime), hour=lubridate::hour(datetime),
         time_bin = case_when(hour>=0&hour<6~"Midnight",
                              hour>=6&hour<12~"Morning",
                              hour>=12&hour<18~"Afternoon",
                              hour>=18~"Night"),
         weekday.weekend = ifelse(day %in% c(11,12,18,19),"Weekend","Weekday"),
         day.week = case_when(day==6|day==13~"Monday",
                              day==7|day==14~"Tuesday",
                              day==8|day==15~"Wednesday",
                              day==9|day==16~"Thursday",
                              day==10|day==17~"Friday",
                              day==11|day==18~"Saturday",
                              day==12|day==19~"Sunday",))
knitr::kable(final_cc %>% 
             filter(weekday.weekend=="Weekday"&time_bin=="Night") %>% 
             filter(location =="Ouzeri Elian"&(name=="Baza Isak"|name=="Calixto Nils")) %>% 
             select(location, datetime, name, price, CurrentEmploymentType,CurrentEmploymentTitle)
               , "simple",
      caption="Baza Isak and Calixto Nils transactions at Ouzeri Elian on Weekdays Nights")
Table 6: Baza Isak and Calixto Nils transactions at Ouzeri Elian on Weekdays Nights
location datetime name price CurrentEmploymentType CurrentEmploymentTitle
Ouzeri Elian 2014-01-08 21:16:00 Calixto Nils 30.81 Information Technology IT Helpdesk
Ouzeri Elian 2014-01-08 21:17:00 Baza Isak 29.85 Information Technology IT Technician
Ouzeri Elian 2014-01-09 19:42:00 Baza Isak 27.08 Information Technology IT Technician
Ouzeri Elian 2014-01-10 18:52:00 Baza Isak 19.92 Information Technology IT Technician
Ouzeri Elian 2014-01-13 19:30:00 Calixto Nils 28.75 Information Technology IT Helpdesk
Ouzeri Elian 2014-01-14 20:32:00 Baza Isak 11.86 Information Technology IT Technician
Ouzeri Elian 2014-01-15 20:29:00 Baza Isak 23.18 Information Technology IT Technician
Ouzeri Elian 2014-01-16 20:25:00 Baza Isak 23.89 Information Technology IT Technician
Ouzeri Elian 2014-01-16 20:28:00 Calixto Nils 9.91 Information Technology IT Helpdesk
Ouzeri Elian 2014-01-17 19:40:00 Baza Isak 38.60 Information Technology IT Technician
Ouzeri Elian 2014-01-17 20:28:00 Calixto Nils 35.81 Information Technology IT Helpdesk

Apart from the transactional data performed by employees, we will look into the GPS data to observe for any gathering and potential relationships. Figure 10 shows every employee car GPS stationary coordinates.

gps_stop_points <- gps_pts %>%
  mutate(time.stop = difftime(next.start.time, end.time,units=c("mins")), 
         time.stop = as.numeric(time.stop),
         time.location = difftime(next.start.time,end.time),
         time.location = as.numeric(time.location),
         name=paste(LastName,FirstName),
         id=as.character(id),id=as.numeric(id),
         gps.coord=end.gps)%>% 
  filter(id<100 ) %>%
  dplyr::select(name, CurrentEmploymentType,CurrentEmploymentTitle,
                end.time, end.gps,next.start.time,time.location) %>% 
  rename(Arrival.Time=end.time, Coordinate=end.gps,
         Next_move_off_time=next.start.time,Time_at_location=time.location)

## Plot interactive map
tmap_mode("view")
fullmap<-tm_shape(bgmap) +
  tm_rgb(bgmap, r=1, g=2, b=3, alpha=NA, saturation=1, 
         interpolate=TRUE, max.value=255) +
  tm_shape(gps_stop_points)+
  tm_markers()

tmap_leaflet(fullmap)

Figure 10: Stationary GPS points of all cars

  1. Hovering around the location in between Arkadiou Park and Sannan Park with coordinates (24.89, 36.06) reveals 57 GPS stationary coordinates at that location. The 57 GPS points belongs to Dedos Lidelse, Osvaldo Hennie and Frente Birgitta cars. From the GPS timestamp in table 11, Dedos Lidelse car stops at the location overnight daily. We can deduce that the location is likely the home of Dedos Lidelse. Hence, the alluvial diagram in figure 11 was used to visualise the time spent at Dedos Lidelse house for the three employees. We can observe some trends based on the time spent at the location.
knitr::kable(gps_stop_points %>% st_drop_geometry() %>% 
               filter(name=="Dedos Lidelse", Time_at_location>450) %>% 
               select(name, Arrival.Time, Coordinate, 
                      Next_move_off_time, Time_at_location),
             "simple",
             caption="Table of transaction at Chostus Hotel")
Table 7: Table of transaction at Chostus Hotel
name Arrival.Time Coordinate Next_move_off_time Time_at_location
Dedos Lidelse 2014-01-06 08:13:09 POINT (24.87957 36.04803) 2014-01-06 12:24:01 15052
Dedos Lidelse 2014-01-06 12:27:23 POINT (24.89995 36.05454) 2014-01-06 13:58:01 5438
Dedos Lidelse 2014-01-06 14:01:23 POINT (24.87957 36.04803) 2014-01-06 17:32:01 12638
Dedos Lidelse 2014-01-06 17:37:39 POINT (24.89608 36.06333) 2014-01-06 19:01:01 5002
Dedos Lidelse 2014-01-06 19:04:05 POINT (24.89993 36.05448) 2014-01-06 20:13:01 4136
Dedos Lidelse 2014-01-06 20:18:01 POINT (24.89612 36.06343) 2014-01-07 07:01:01 38580
Dedos Lidelse 2014-01-07 07:04:46 POINT (24.88594 36.0637) 2014-01-07 07:43:01 2295
Dedos Lidelse 2014-01-07 07:50:09 POINT (24.87957 36.04803) 2014-01-07 12:12:01 15712
Dedos Lidelse 2014-01-07 12:18:16 POINT (24.85805 36.05974) 2014-01-07 14:00:01 6105
Dedos Lidelse 2014-01-07 14:06:16 POINT (24.87957 36.04803) 2014-01-07 17:40:01 12825
Dedos Lidelse 2014-01-07 17:45:39 POINT (24.89613 36.06337) 2014-01-07 18:47:01 3682
Dedos Lidelse 2014-01-07 18:50:05 POINT (24.89997 36.0545) 2014-01-07 20:17:01 5216
Dedos Lidelse 2014-01-07 20:20:05 POINT (24.89616 36.06332) 2014-01-08 07:12:01 39116
Dedos Lidelse 2014-01-08 07:15:46 POINT (24.88589 36.06366) 2014-01-08 07:48:01 1935
Dedos Lidelse 2014-01-08 07:55:09 POINT (24.87957 36.04803) 2014-01-08 12:25:01 16192
Dedos Lidelse 2014-01-08 12:28:42 POINT (24.90247 36.05584) 2014-01-08 13:58:01 5359
Dedos Lidelse 2014-01-08 14:01:42 POINT (24.87957 36.04803) 2014-01-08 17:43:01 13279
Dedos Lidelse 2014-01-08 17:48:39 POINT (24.89616 36.06341) 2014-01-08 19:47:01 7102
Dedos Lidelse 2014-01-08 19:52:01 POINT (24.89989 36.05452) 2014-01-08 21:17:01 5100
Dedos Lidelse 2014-01-08 21:22:01 POINT (24.89617 36.0634) 2014-01-09 07:17:01 35700
Dedos Lidelse 2014-01-09 07:20:46 POINT (24.88587 36.06365) 2014-01-09 07:59:01 2295
Dedos Lidelse 2014-01-09 08:06:09 POINT (24.87957 36.04803) 2014-01-09 12:00:01 14032
Dedos Lidelse 2014-01-09 12:08:17 POINT (24.85103 36.06349) 2014-01-09 13:50:01 6104
Dedos Lidelse 2014-01-09 13:58:17 POINT (24.87957 36.04802) 2014-01-09 17:34:01 12944
Dedos Lidelse 2014-01-09 17:39:39 POINT (24.89615 36.06342) 2014-01-10 07:08:01 48502
Dedos Lidelse 2014-01-10 07:11:46 POINT (24.8859 36.06365) 2014-01-10 07:44:01 1935
Dedos Lidelse 2014-01-10 07:51:09 POINT (24.87958 36.04802) 2014-01-10 12:19:01 16072
Dedos Lidelse 2014-01-10 12:27:17 POINT (24.851 36.06342) 2014-01-10 14:08:01 6044
Dedos Lidelse 2014-01-10 14:16:17 POINT (24.87957 36.04802) 2014-01-10 17:49:01 12764
Dedos Lidelse 2014-01-10 17:54:39 POINT (24.89615 36.06334) 2014-01-10 18:50:01 3322
Dedos Lidelse 2014-01-10 18:59:30 POINT (24.86038 36.08545) 2014-01-10 23:30:01 16231
Dedos Lidelse 2014-01-10 23:39:30 POINT (24.89611 36.0634) 2014-01-11 18:51:01 69091
Dedos Lidelse 2014-01-11 18:54:05 POINT (24.8999 36.05446) 2014-01-11 20:54:01 7196
Dedos Lidelse 2014-01-11 20:57:00 POINT (24.89635 36.06331) 2014-01-12 12:30:01 55981
Dedos Lidelse 2014-01-12 12:38:28 POINT (24.85762 36.07668) 2014-01-12 14:07:01 5313
Dedos Lidelse 2014-01-12 14:15:28 POINT (24.89613 36.06335) 2014-01-12 18:24:01 14913
Dedos Lidelse 2014-01-12 18:27:05 POINT (24.89991 36.05447) 2014-01-12 20:53:01 8756
Dedos Lidelse 2014-01-12 20:56:03 POINT (24.89614 36.06337) 2014-01-13 07:05:01 36538
Dedos Lidelse 2014-01-13 07:23:06 POINT (24.85092 36.08183) 2014-01-13 07:52:01 1735
Dedos Lidelse 2014-01-13 08:12:11 POINT (24.87958 36.04802) 2014-01-13 12:25:01 15170
Dedos Lidelse 2014-01-13 12:28:23 POINT (24.89994 36.05446) 2014-01-13 13:43:01 4478
Dedos Lidelse 2014-01-13 13:46:23 POINT (24.87957 36.04802) 2014-01-13 17:48:01 14498
Dedos Lidelse 2014-01-13 17:53:39 POINT (24.89612 36.06332) 2014-01-13 19:00:01 3982
Dedos Lidelse 2014-01-13 19:05:01 POINT (24.89998 36.05449) 2014-01-13 21:04:01 7140
Dedos Lidelse 2014-01-13 21:09:01 POINT (24.89612 36.06339) 2014-01-14 07:43:01 38040
Dedos Lidelse 2014-01-14 07:46:46 POINT (24.88589 36.06364) 2014-01-14 07:57:01 615
Dedos Lidelse 2014-01-14 08:04:09 POINT (24.87957 36.04803) 2014-01-14 12:05:01 14452
Dedos Lidelse 2014-01-14 12:08:42 POINT (24.90256 36.05573) 2014-01-14 13:50:01 6079
Dedos Lidelse 2014-01-14 13:53:42 POINT (24.87958 36.04802) 2014-01-14 17:41:01 13639
Dedos Lidelse 2014-01-14 17:46:39 POINT (24.89607 36.06341) 2014-01-14 19:14:01 5242
Dedos Lidelse 2014-01-14 19:17:05 POINT (24.89994 36.05452) 2014-01-14 20:30:01 4376
Dedos Lidelse 2014-01-14 20:36:01 POINT (24.8961 36.06338) 2014-01-15 07:41:01 39900
Dedos Lidelse 2014-01-15 07:50:52 POINT (24.87958 36.04803) 2014-01-15 12:03:01 15129
Dedos Lidelse 2014-01-15 12:11:26 POINT (24.85237 36.06582) 2014-01-15 13:42:01 5435
Dedos Lidelse 2014-01-15 13:50:26 POINT (24.87957 36.04803) 2014-01-15 17:48:01 14255
Dedos Lidelse 2014-01-15 17:53:39 POINT (24.89614 36.06337) 2014-01-15 18:49:01 3322
Dedos Lidelse 2014-01-15 18:52:08 POINT (24.90177 36.05501) 2014-01-15 20:37:01 6293
Dedos Lidelse 2014-01-15 20:40:08 POINT (24.8961 36.06341) 2014-01-16 07:13:01 37973
Dedos Lidelse 2014-01-16 07:16:46 POINT (24.88592 36.06365) 2014-01-16 07:55:01 2295
Dedos Lidelse 2014-01-16 08:02:09 POINT (24.87957 36.04803) 2014-01-16 12:17:01 15292
Dedos Lidelse 2014-01-16 12:23:16 POINT (24.85804 36.05971) 2014-01-16 13:51:01 5265
Dedos Lidelse 2014-01-16 13:57:16 POINT (24.87957 36.04803) 2014-01-16 17:38:01 13245
Dedos Lidelse 2014-01-16 17:43:39 POINT (24.89612 36.06333) 2014-01-16 19:01:01 4642
Dedos Lidelse 2014-01-16 19:04:08 POINT (24.9018 36.05496) 2014-01-16 19:46:01 2513
Dedos Lidelse 2014-01-16 19:49:08 POINT (24.8961 36.06338) 2014-01-17 07:28:01 41933
Dedos Lidelse 2014-01-17 07:31:46 POINT (24.8859 36.06365) 2014-01-17 08:00:01 1695
Dedos Lidelse 2014-01-17 08:07:09 POINT (24.87957 36.04802) 2014-01-17 11:56:01 13732
Dedos Lidelse 2014-01-17 12:04:26 POINT (24.85237 36.06584) 2014-01-17 13:59:01 6875
Dedos Lidelse 2014-01-17 14:07:26 POINT (24.87957 36.04803) 2014-01-17 17:40:01 12755
Dedos Lidelse 2014-01-17 17:45:39 POINT (24.89608 36.06337) 2014-01-18 12:38:01 67942
Dedos Lidelse 2014-01-18 12:40:56 POINT (24.90247 36.05582) 2014-01-18 13:31:01 3005
Dedos Lidelse 2014-01-18 13:31:14 POINT (24.90183 36.05503) 2014-01-18 15:36:01 7487
Dedos Lidelse 2014-01-18 15:39:45 POINT (24.89617 36.06341) 2014-01-18 18:21:01 9676
Dedos Lidelse 2014-01-18 18:24:05 POINT (24.89991 36.05443) 2014-01-18 19:37:01 4376
Dedos Lidelse 2014-01-18 19:40:05 POINT (24.89612 36.0634) 2014-01-19 18:25:01 81896
Dedos Lidelse 2014-01-19 18:28:05 POINT (24.8999 36.05449) 2014-01-19 20:06:01 5876

1.1 Frente Birgitta and Osvaldo Hennie often arrive at the location around 1700 hrs and leave at 1900 hrs on weekdays only.

1.2 Frente Birgitta would often drop by the location twice a days. On those days, Frente Birgitta would arrive around 1700hrs and leave at 1900hrs, similar like above and return subsequently to the location after 2000hrs and leave the following morning.

1.3 Osvaldo Hennie only stay overnight at that location 5 times over this period.

Probable deduction is that they were having dinner together at Dedos Lidelse house. An unofficial relationship might exist between Frente Birgitta and Dedos Lidelse. Furthermore, both employees are from the engineering department which might further support the deduction.

denos_loc<-st_set_crs(st_sfc(st_point(c(24.89612,36.06343))),4326)
denos_home<-gps_stop_points %>% 
  mutate(dist_denos = st_distance(Coordinate, denos_loc),
         dist_denos=as.numeric(dist_denos)) %>% 
  filter(dist_denos<50) %>% 
  mutate(arrival.date=lubridate::date(Arrival.Time),
         arrival.hour=lubridate::hour(Arrival.Time),
         departure.date=lubridate::date(Next_move_off_time),
         departure.hour=lubridate::hour(Next_move_off_time),
         name2=name) %>% 
  filter(arrival.hour>16) %>% 
  select(name2, name,arrival.date, departure.date,arrival.hour,departure.hour)%>% 
  st_drop_geometry() %>% to_lodes_form(denos_home, key="Variables",axes=2:6)

ggplot(denos_home, aes(x=Variables,stratum=stratum,alluvium=alluvium))+
  geom_alluvium(aes(fill=name2),discern=FALSE)+
  geom_stratum(width=1/3,alpha=.2,discern=FALSE)+
  geom_label(stat="stratum",size=2,aes(label=after_stat(stratum)))+
  theme(axis.text.y=element_blank(),
        axis.title.x=element_blank(),
        axis.ticks.y=element_blank(),
        axis.text.x = element_text(size=8),
        legend.position="none")+
  labs(fill="Name")
Alluvial Diagram of time spent at Dedos Lidelse house

Figure 11: Alluvial Diagram of time spent at Dedos Lidelse house